


REMARKS 

The claims are 1-20. Claims 1, 3, 4, 7, 14, 16 and 18 have been amended. 
Claims 1, 3, 7, 14, 16 and 18 are in independent form. Favorable reconsideration and 
allowance of the subject application are respectfully requested in view of the following 
comments. 

Claims 1, 3, 7, 14, 16 and 18 have been amended to clarify that the energy bar 
claimed has about 2 to about 55 g of carbohydrates, about 1 to about 4.5 g of fortification 
components, about 5 to about 40 g of protein, about 2 to about 10 g of fat, about 150 to about 
300 calories, and a moisture content of less than about 15% by weight, based on a 55 g 
serving size. Support for the amendment can be found, for example, in paragraph [0016] on 
pages 4-5, and paragraph [0042] on page 12 of the specification. 

Claim 4 has been amended to correct a minor error. 

Claims 1-3, 7, 15, 17, 19 and 20 stand rejected under 35 U.S.C. § 112, second
paragraph, for allegedly being indefinite. Specifically, the Office Action has objected to the 
use of the terms "hedonic score," "confidence level," and "acceptability." Applicants 
respectfully direct the Examiner's attention to paragraphs [0020] and [0022] on pages 5 and 6 
of the specification, where the definitions for "hedonic score" and "confidence level" are 
provided. Moreover, Applicants note that the term "acceptability" is understood in the food 
industry to denote a consumer's willingness to eat a product. See Principles of Sensory 
Evaluation of Food, 1965, p. 278. Applicants also wish to point out that one skilled in the art 
understands that the hedonic score and confidence intervals are statistically determined 
measurements and are reproducible within a certain degree of error. Applicants respectfully 
direct the Examiner's attention to the following publications, which demonstrate the use of
these terms throughout the food industry: Sensory Analysis of Foods, pp. 250, 254-257, and
366 [1]; Statistical Methods in Food and Consumer Research, pp. 7 and 8; and Principles of
Sensory Evaluation of Food pp. 275-289. Copies of each are enclosed for the Examiner's 
convenience. Accordingly, Applicants respectfully request withdrawal of the Section 112 
rejections. 

Claims 1-20 have been provisionally rejected under the judicially created 
doctrine of obviousness-type double patenting as being unpatentable over claim 20 of 
copending Application No. 10/272,571 (the '571 application) and claims 1-20 of copending 
Application No. 10/271,710 (the '710 application). Applicants note that the '571 application 
was abandoned on September 15, 2004 for being non-responsive to the Office Action issued 
on June 15, 2004. As such, the provisional rejection based on the '571 application is rendered 
moot. Regarding the provisional rejection based upon the '710 application, a Terminal 
Disclaimer is submitted herewith. In light of the above comments, it is believed that the 
provisional double patenting rejections have been obviated, and their withdrawal is therefore 
respectfully requested. 

Claims 1-10 stand rejected under 35 U.S.C. § 102(b) as allegedly being 
anticipated by U.S. Patent No. 4,055,669 ("Kelly"). Claims 1-13 and 18-20 stand rejected 
under 35 U.S.C. § 103(a) as allegedly being obvious over Kelly in view of U.S. Patent No. 
6,592,915 ("Froseth") and a recipe for Pfeffernusse found in the book titled Joy of Cooking
("Rombauer"), on page 708. Claims 14-17 stand rejected under 35 U.S.C. § 102(b) as 
allegedly being anticipated by Rombauer. Applicants respectfully traverse these rejections, in 
view of the comments set forth below. 

[1] The hedonic score may be based on a nine-point scale or seven-point scale. For purposes of the present
invention, a seven-point scale was selected.

As amended, claim 1 is directed to an energy food bar that provides about 2 to
about 55 g of carbohydrates, about 1 to about 4.5 g of fortification components, about 5 to
about 40 g of protein, about 2 to about 10 g of fat, about 150 to about 300 calories, and a
moisture content of less than about 15% by weight, based on a 55 g serving size.

Kelly is directed to a high protein fat occluded food composition made of
cereal particles and a binder. The binder includes a protein source coated with an edible fat,
which masks the protein flavor, making the binder taste bland.

Applicants have reviewed Kelly and have determined that the amount of fat in
the food composition exceeds the permissible amount set forth in claim 1 of about 2 to about
10 g of fat, based on a 55 g serving size. In column 2, lines 56-58, Kelly discloses that a
binder composition makes up 60-70% of the food composition. Kelly further states in column
3, lines 61-64, that "[t]he fat content of the binder composition ranges from a minimum of
about 33% by weight to a maximum of about 85% by weight, preferably about 47% by
weight[.]" Therefore, the minimum amount of fat present in the binder composition of Kelly
can be calculated by multiplying the (percent binder) by the (percent fat in the binder) by the
(serving size). For a 55 g serving, the minimum amount of fat present in the binder
composition alone is 10.9 g of fat (55 g × (33% fat) × (60% binder)). Moreover, additional
fat in the food composition of Kelly is found in the cereal components that make up the other
40% of the food composition. Low fat cereal components such as crisp rice or corn flakes
have about 0.5% fat. For a 55 g serving basis, this would amount to 0.1 g of fat (55 g × (0.5%
fat) × (40% cereal)) in the cereal portion. The minimum total amount of fat in the food
composition is therefore calculated to be 11 g of fat. This clearly exceeds the range of about 2
to about 10 grams of fat permitted in the energy food product set forth in claim 1. As such, it
is respectfully submitted that claim 1 is patentable over Kelly. 
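
For illustration only, the minimum-fat arithmetic above can be checked with a short Python sketch. The percentages are the ones quoted from Kelly in the preceding paragraph; the variable and function names are hypothetical, and treating the upper limit of "about 10 g" as exactly 10.0 g is a simplifying assumption.

```python
# Minimal sketch of the Kelly fat calculation (55 g serving basis).
# All names are illustrative; the figures come from the paragraph above.
SERVING_G = 55.0
BINDER_FRACTION_MIN = 0.60   # binder is 60-70% of the composition (lower bound)
FAT_IN_BINDER_MIN = 0.33     # binder fat content, minimum of about 33% by weight
CEREAL_FRACTION = 0.40       # remainder of the composition at 60% binder
FAT_IN_CEREAL = 0.005        # low fat cereal, about 0.5% fat
CLAIMED_FAT_MAX_G = 10.0     # assumption: "about 10 g" treated as exactly 10 g


def fat_grams(serving_g, component_fraction, fat_fraction):
    """Grams of fat contributed by one component of the serving."""
    return serving_g * component_fraction * fat_fraction


binder_fat = fat_grams(SERVING_G, BINDER_FRACTION_MIN, FAT_IN_BINDER_MIN)
cereal_fat = fat_grams(SERVING_G, CEREAL_FRACTION, FAT_IN_CEREAL)
total_fat = binder_fat + cereal_fat

print(f"binder: {binder_fat:.1f} g, cereal: {cereal_fat:.1f} g, total: {total_fat:.1f} g")
print("exceeds claimed maximum:", total_fat > CLAIMED_FAT_MAX_G)
```

Because the lower bounds of both Kelly ranges are used, the total is a floor: any larger binder fraction or binder fat content would only increase it.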

Claim 2 directly depends from claim 1. For at least the same reasons discussed 
above in connection with claim 1, claim 2 is patentable over Kelly. 

Independent claims 3 and 7, as well as their respective dependent claims, require
that the energy bar have about 2 to about 55 g of carbohydrates, about 1 to about 4.5 g of
fortification components, about 5 to about 40 g of protein, about 2 to about 10 g of fat,
about 150 to about 300 calories, and a moisture content of less than about 15% by weight,
based on a 55 g serving size. As such, claims 3 and 7 and their respective dependent claims
are patentable over Kelly.

Froseth discloses a layered cereal bar having identifiable ready to eat cereal 
pieces and at least one visible filling layer. The cereal bar has a total nutrient level equal to or 
greater than the nutrient level of a single serving of boxed cereal with milk. 

Froseth, however, does not disclose a cereal bar having about 1 to about 5 g of
fortification components. In column 15, lines 17-25, Froseth discloses an embodiment where
the amount of tricalcium phosphate (TCP), i.e., mineral, in the binder is 3% on a weight basis.
Froseth also discloses that the binder makes up 40% of the cereal bar (see column 11, lines
15-16). For a 55 g serving basis, the amount of TCP in the cereal bar can be calculated to be
0.66 g of TCP (55 g × (40% binder) × (3% TCP in binder)). Therefore, the cereal bar of
Froseth does not fall within the fortification component range of about 1 to about 4.5 grams in
the energy bar set forth in claim 1. As such, the cereal bar of Froseth would not qualify as an
energy bar. 
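
For illustration only, the TCP figure above can be reproduced with a few lines of Python. The binder and TCP percentages are those cited from Froseth; the names are hypothetical, and treating the lower limit of "about 1 g" as exactly 1.0 g is a simplifying assumption.

```python
# Minimal sketch of the Froseth fortification calculation (55 g serving basis).
# Names are illustrative; the figures come from the paragraph above.
SERVING_G = 55.0
BINDER_FRACTION = 0.40           # binder makes up 40% of the cereal bar
TCP_IN_BINDER = 0.03             # tricalcium phosphate is 3% of the binder by weight
CLAIMED_FORTIFICATION_MIN_G = 1.0  # assumption: "about 1 g" treated as exactly 1 g

tcp_grams = SERVING_G * BINDER_FRACTION * TCP_IN_BINDER

print(f"TCP per 55 g serving: {tcp_grams:.2f} g")
print("below claimed minimum:", tcp_grams < CLAIMED_FORTIFICATION_MIN_G)
```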



Rombauer is cited for disclosing a recipe for Pfeffernusse. The Office Action 
states that "an energy matrix made of corn syrup which is combined with a solid component,
grated lemon rind, which is mixed into a fat-carbohydrate matrix (butter and sugar)(page 708). 
The composition is considered to have a lubricious mouthfeel since the claimed ingredients 
are used." 

Applicants note, however, that Rombauer fails to meet the protein level 
required by the range of about 5 to about 40 g, set forth in claim 1. The table below provides
a breakdown of the ingredients used to make the Pfeffernusse composition. 



PFEFFERNUSSE

Ingredient          Amount        Grams of Protein
                                  (based on 55 g serving)

Flour               2.01 cups     3.21
Baking Powder       0.75 tsp
Baking Soda         0.13 tsp
Salt                0.25 tsp
Black Pepper        0.25 tsp      0.01
Nutmeg              0.25 tsp      0.01
Cinnamon            1 tsp         0.01
Fennel Seed         1 tsp         0.05
Butter              0.5 cups      0.03
Sugar               0.33 cup
Egg                 1             0.47
Chopped Almonds     0.25 cup      0.82
Chopped Citron      1 tbsp
Orange Peel         0.25 cup
Molasses            0.33 cup
Corn Syrup          1 tbsp
Brandy              0.33 cup
Lemon Rind          1 tsp
Lemon Juice         1 tbsp

TOTAL                             4.61



Applicants have determined that the protein content in the Pfeffernusse
composition is approximately 4.6 g. This does not fall within the protein range of about 5 to
about 40 g (based on a 55 g serving), claimed in claim 1. Moreover, the Pfeffernusse
composition is not seen to include fortification components. As such, the range of about 1 to
about 4.5 g of fortification components, set forth in claim 1, is not met. Clearly, the
Pfeffernusse composition of Rombauer, does not qualify as an energy bar. 
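
For illustration only, the protein total can be re-added with the Python sketch below. It includes only the ingredients for which the table lists a protein value; ingredients with a blank protein entry are left out rather than assigned a number. The names are hypothetical, and treating the lower limit of "about 5 g" as exactly 5.0 g is a simplifying assumption.

```python
# Minimal sketch: sum the per-ingredient protein values listed in the
# Pfeffernusse table (grams of protein per 55 g serving).
# Ingredients with blank protein entries in the table are omitted.
protein_g = {
    "Flour": 3.21,
    "Black Pepper": 0.01,
    "Nutmeg": 0.01,
    "Cinnamon": 0.01,
    "Fennel Seed": 0.05,
    "Butter": 0.03,
    "Egg": 0.47,
    "Chopped Almonds": 0.82,
}
CLAIMED_PROTEIN_MIN_G = 5.0  # assumption: "about 5 g" treated as exactly 5 g

total_protein = sum(protein_g.values())

print(f"total protein: {total_protein:.2f} g")
print("below claimed minimum:", total_protein < CLAIMED_PROTEIN_MIN_G)
```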

Applicants respectfully submit that Kelly, Froseth, and Rombauer, whether 
taken alone or in any permissible combination, do not disclose or suggest the presently 
claimed invention of an energy bar that provides about 2 to about 55 g of carbohydrates, about 
1 to about 4.5 g of fortification components, about 5 to about 40 g of protein, about 2 to about 
10 g of fat, about 150 to about 300 calories, and a moisture content of less than about 15% by 
weight, based on a 55 g serving size, as set forth in claim 1.

Claim 2 directly depends from claim 1. For at least the same reasons discussed
above in connection with claim 1, claim 2 is patentable over Kelly, Froseth, and Rombauer 
whether considered alone or in any permissible combination. 

Like claim 1, independent claims 3, 7 and 18 each require that the energy bar 
have about 2 to about 55 g of carbohydrates, about 1 to about 4.5 g of fortification 
components, about 5 to about 40 g of protein, about 2 to about 10 g of fat, about 150 to
about 300 calories, and a moisture content of less than about 15% by weight, based on a 55 g 
serving size. For at least the same reasons discussed above for claim 1, claims 3, 7 and 18 are 
patentable over Kelly, Froseth, and Rombauer, whether considered alone or in combination. 

Claim 14 is a product-by-process claim and claim 18 is a method claim, which
require that the energy bar have about 2 to about 55 g of carbohydrates, about 1 to about 4.5 g
of fortification components, about 5 to about 40 g of protein, about 2 to about 10 g of fat,
about 150 to about 300 calories, and a moisture content of less than about 15% by weight,
based on a 55 g serving size. 

As previously noted, the Rombauer Pfeffernusse composition has 
approximately 4.6 g of protein (based on a 55 g serving) and no fortification components. 
Therefore the Pfeffernusse composition does not meet the protein level of about 5 g to about 
40 g of protein, and the fortification level of about 1 to about 4.5 g, set forth in claims 14 and 
16. As such, claims 14 and 16 are patentable over Rombauer. 

Claim 15 depends from claim 14, and claim 17 depends from claim 16. Claims 
15 and 17 are also patentable over Rombauer for the same reasons discussed above for claims 
14 and 16. 

In view of the foregoing remarks, Applicants respectfully request favorable 
reconsideration and early passage to issue of the present application. 

Applicants' undersigned attorney may be reached in our New York office by 
telephone at (212) 218-2100. All correspondence should continue to be directed to our below 
listed address. 




Attorney for Applicants 
Victor Tsu 

Registration No. 46,185 



FITZPATRICK, CELLA, HARPER & SCINTO 
30 Rockefeller Plaza 
New York, NY 10112-3801 
Facsimile: (212) 218-2200







In re Application of:

EDWARD L. RAPP ET AL 

Application No.: 10/615,249 

Filed: July 8, 2003 

For: TASTING ENERGY BAR 
(As Amended) 



Docket No. 02280.003720. 

Examiner: H. F. Pratt 
Group Art Unit: 1761 

Date: April 4, 2005 



THE COMMISSIONER FOR PATENTS 
P.O. Box 1450 
Alexandria, VA 22313-1450 

Sir: 

Transmitted herewith is an Amendment and a Terminal Disclaimer in the above-identified application. 



[X] No additional fee is required.



The fee has been calculated as shown below 



CLAIMS AS AMENDED

                 (2)                        (4)               (5)
                 CLAIMS REMAINING           HIGHEST NO.       PRESENT                  ADDITIONAL
                 AFTER AMENDMENT            PREVIOUSLY        EXTRA      RATE          FEE
                                            PAID FOR

TOTAL CLAIMS     *  20            MINUS     **  20            0          x $25/$50     0.00

INDEP. CLAIMS    *  6             MINUS     *** 6             0          x $100/$200   0.00

Fee for Multiple Dependent claims $180/$360

TOTAL ADDITIONAL FEE FOR THIS AMENDMENT                                                0.00



* If the entry in Column 2 is less than the entry in Column 4, write "0" in Column 5. 
** If the "Highest Number Previously Paid For" IN THIS SPACE is less than 20, write "20" in this space. 
*** If the "Highest Number Previously Paid For" IN THIS SPACE is less than 3, write "3" in this space. 






[ ] Verified Statement claiming small entity status is enclosed, if not filed previously.
[ ] A check in the amount of $____ is enclosed.

[ ] Charge $____ to Deposit Account No. 06-1205. A duplicate copy of this sheet is enclosed.

[X] Any prior general authorization to charge an issue fee under 37 C.F.R. 1.18 to Deposit Account No.
06-1205 is hereby revoked. The Commissioner is hereby authorized to charge any additional fees under
37 C.F.R. 1.16 and 1.17 which may be required during the entire pendency of this application, or to
credit any overpayment, to Deposit Account No. 06-1205. A duplicate copy of this paper is enclosed.

[ ] A check in the amount of $____ to cover the fee for a ____ month extension is enclosed.



[X] A check in the amount of $130.00 to cover the Terminal Disclaimer fee is enclosed.



Applicants' undersigned attorney may be reached in our New York office by telephone at
(212) 218-2100. All correspondence should continue to be directed to our address given below.




Attorney for Applicants 
Registration No.: 46,185 



FITZPATRICK, CELLA, HARPER & SCINTO 
30 Rockefeller Plaza 
New York, New York 10112-3800
Facsimile: (212) 218-2200






C. O. CHICHESTER

University of California
Davis, California

G. F. STEWART

University of California
Davis, California



PRINCIPLES OF 

SENSORY EVALUATION 
OF FOOD 

by

Maynard A. Amerine
Rose Marie Pangborn
Edward B. Roessler

DEPARTMENTS OF VITICULTURE AND ENOLOGY,
FOOD SCIENCE AND TECHNOLOGY, AND MATHEMATICS,
UNIVERSITY OF CALIFORNIA,
DAVIS, CALIFORNIA



1965 




ACADEMIC PRESS New York and London. 



PRINTED IN THE UNITED STATES OF AMERICA.

ALL RIGHTS RESERVED.

NO PART OF THIS BOOK MAY BE REPRODUCED IN ANY FORM,
BY PHOTOSTAT, MICROFILM, OR ANY OTHER MEANS, WITHOUT
WRITTEN PERMISSION FROM THE PUBLISHERS.

ACADEMIC PRESS INC.
111 Fifth Avenue, New York, New York 10003

United Kingdom Edition published by
ACADEMIC PRESS INC. (LONDON) LTD.
Berkeley Square House, London W.1

LIBRARY OF CONGRESS CATALOG CARD NUMBER: 65*8370(1






Chapter 6 
Laboratory Studies: Types and Principles



Foods are submitted to sensory examination to provide information
that can lead to product improvement, quality maintenance, the
development of new products, or analysis of the market. This section
summarizes the most important types of sensory problems encountered
by food research groups and the main types of procedures used in
solving them. This chapter covers the use of laboratory panels, as do
Chapters 7 and 8. Consumer testing is discussed in Chapter 9, and
statistical procedures for evaluation of the results of both types of panels
are covered in Chapter 10.

Tests may be conducted to: (1) select qualified judges and study
human perception of food attributes; (2) correlate sensory with chemical
and physical measurements; (3) study processing effects, maintain
quality, evaluate raw material selection, establish storage stability, or reduce
costs; (4) evaluate quality; or (5) determine consumer reaction. Each
of these purposes requires appropriate tests. In general, laboratory
panels are used for the first three purposes, highly trained experts for
the fourth, and large consumer groups for the last.

In this text we distinguish between two types of laboratory panels:
(1) those which determine simple differences between treated samples;
and (2) those which determine directional differences. Both are
laboratory panels, and sometimes untrained judges are used, but it is the
thesis of this book that trained subjects are more useful. The
advantages of such panels are discussed in Chapter 7.

I. Types of Tests 

The most important types of tests and their utilization are briefly 
described here. More detailed information of each procedure is given 
in Chapters 7, 8, and 10. 

A. Difference Tests

The common true difference tests are referred to as single-stimulus,
paired-stimuli, duo-trio, triangle, and multi-sample tests. In tests which
do not reveal statistically significant differences between treatments, no
further evaluation is needed. When differences are found, however,
directional difference tests are used to establish the nature and magnitude of
difference. After a significant difference has been established by a
laboratory panel, consumers may be asked to express preferences.

Since most perceptual judgments are relative, single-sample
presentation is used infrequently, except at the consumer level. Expert tasters of
wines, beers, coffee, tea, and dairy products rate single samples, but they
evaluate the quality of many samples at a time and compare them
against their pre-established "memory standard." Occasionally a method
called "A-not A" is used (Peryam, 1958), in which a standard, A, is
presented followed by one or more coded samples. The judge indicates
which one(s) is (are) A. This method may be classified as a paired
comparison rather than single presentation since each coded sample is
compared with the standard.

In the paired-stimuli procedure, judges simply specify whether there
is a difference between two samples. When the judge also indicates what
sensory characteristic distinguishes the two samples, we speak of the test
as a paired-comparison. The samples are presented in a counter-balanced
design, and a forced-choice is usually required. One half of the responses
could be correct due to chance alone. The number of samples tested at a
single session will depend on the commodity, the experience of the
judges, and the amount of time and sample available. Paired testing is
typically used in comparing new with old processing procedures, in
quality control, and in preference testing at the consumer level.

The duo-trio is a modified paired presentation in which one sample is
identified and presented first, followed by two coded samples, one of
which is identical with the standard. The judge is asked which of the two
is the same as the first sample. This method is primarily a laboratory tool
for use with trained subjects. It lends itself to use for quality control and
for selection of judges of superior discrimination.

In the triangle test, two identical and one different samples are presented
simultaneously and the judge is asked to indicate the odd sample.
Correct identification due to chance alone is one third. Like the duo-trio
method, the triangle test should be used only by trained laboratory 
judges, and is suited to similar problems. 
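
For illustration only, the chance levels quoted above (one half for a paired test, one third for a triangle test) can be turned into the probability of obtaining a given number of correct responses by guessing alone. The Python sketch below uses the standard binomial formula; it is not taken from the book, whose own statistical procedures are covered in its Chapter 10, and the function and constant names are hypothetical.

```python
from fractions import Fraction
from math import comb

# Chance of a correct response by guessing, as stated in the text above.
PAIRED_CHANCE = Fraction(1, 2)    # paired-stimuli test
TRIANGLE_CHANCE = Fraction(1, 3)  # triangle test


def chance_of_at_least(correct, trials, p_chance):
    """Binomial probability of `correct` or more right answers in `trials`
    independent judgments when each is right with probability `p_chance`."""
    return sum(
        comb(trials, i) * p_chance**i * (1 - p_chance) ** (trials - i)
        for i in range(correct, trials + 1)
    )


# Example: probability that pure guessing yields at least 8 correct
# identifications in 12 triangle-test judgments.
p_value = chance_of_at_least(8, 12, TRIANGLE_CHANCE)
print(float(p_value))
```

A small probability indicates that the observed number of correct identifications is unlikely to arise from guessing; the threshold for calling a difference significant is a separate choice that the sketch does not make.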

B. Rank Order

Ranking is used to determine how several samples differ on the basis
of a single characteristic. A group of coded samples (which may contain
a control, or standard) are presented simultaneously, and the judge is
asked to rank them in order of the intensity of a specified characteristic.
This method is suitable for use by laboratory judges in product or process
evaluation, by experts for selecting the best sample for a particular use,
and by consumers for expressing relative acceptability among a limited
number of samples. It is of importance that all judges use the same
criteria. When necessary, one criterion (sweetness, for example) can be
ranked, after which another criterion (sourness, viscosity, etc.) may be
ranked in another set of the same samples.

C. Scoring Tests

The best use of scoring tests is in comparisons of a control sample
with several experimental samples. The scoring may be expressed in
terms of deviation from a reference: "no difference from control" to
"very large difference from control." In other experiments, scores may be
used on an absolute basis if the scale is clearly defined and understood
by all judges. Although difference-from-control tests have been used
widely by laboratory panels, the results may be meaningless if the judges
change the basis of their scoring as the test proceeds, i.e., judges become
experts. Thus, this method is best suited for use by experts. The test may
be administered to consumers if it is clearly explained and the decisions
required are simple.

Tests in which deviation from a control is measured are used for
product or process evaluation and critical tests on basic perception of
sensory attributes. Scoring tests are also used in new-product
development, quality control, storage stability tests, screening of intensity levels,
and measuring judge characteristics such as leniency, reproducibility, or
central tendency (see Chapter 5).

D. Descriptive Tests

Descriptive sensory analyses are best conducted only by highly
trained experts completely familiar with the product or the process. Such
tests are used effectively in new-product development, in product or
process improvement, for quality control, and for training judges for
future testing. One type of descriptive test, hedonic, in which degree of
liking is described, is suitable at the consumer level. Among the types of
descriptive tests currently in use are scalar scoring of various types,
hedonic ratings, semantic differential tests, and Arthur D. Little's "Flavor
Profile" (see Chapter 8, Section V).

E. Hedonic Scaling

Scoring is called hedonic when the judge expresses his degree of
liking by checking a point on a scale ranging from extreme disapproval
to extreme approval. A five- to nine-point balanced scale is usually
employed. Hedonic ratings are converted to scores and treated by rank
analysis or analysis of variance. As indicated above, this test has been
used both by experts and by untrained consumers, but we feel it is more
effectively applicable to the latter.

F. Acceptance and Preference

Distinction should be made between acceptance, which is a
willingness to use or eat a product, and preference, which relates to a greater
degree of acceptance of one product over another when a choice is
presented. The acceptance or preferences of a laboratory panel are of
very limited value except in gross screening of treatments. Some of the
test methods described above can be adapted to measurement of
consumer reaction (see Chapter 9).

G. Other Methods

Dilution tests, described in Chapter 9, have been used for laboratory
testing of selected treatments, employing methods of presentation
described above, i.e., single, paired, and multiple samples. Threshold tests
are seldom used except in studies where it is desirable to establish the
minimum detectable difference of an additive or of an off flavor.
Threshold and dilution tests have been used to a limited extent to select judges
who can detect specific sensory properties. When so used, the test
materials and their concentrations should be the same as those likely to be
encountered in the actual test. Sequential analysis (Chapter 10) can
be used to analyze the results.

It is our belief that laboratory judges should be carefully selected and
screened on the basis of their sensitivity to the differences that may be
encountered in the experimental samples. In this sense, all laboratory
panels should consist of experts. It is recognized that in many
organizations the time, money, and personnel necessary to achieve this goal are
unavailable, but unless judges have had extensive training and
experience, they should not be expected to make meaningful evaluations of
quality, particularly of a descriptive nature. Neither should a laboratory
panel, whether small or large, experienced or inexperienced, presume to
predict consumer acceptance or preference. Preferences of a laboratory
group are representative only of a limited and unknown portion of the
consuming public. This concept is discussed in considerable detail in
Chapter 9.

II. Panel Selection and Testing Environment

Systematic analysis of the sensory properties of foods involves the use
of human subjects in a laboratory environment. The sensitivity and
reproducibility of the analytical tool (in this case, the judge) greatly
influence the direction and validity of the results. The environment under
which the judgments are obtained also influences the data. Of additional
importance are the time and labor and the supplies and equipment
involved, for these factors materially control the cost of sensory analyses.
We agree with Foster (1954) that more emphasis must be placed on
controlling physical and psychological influences in sensory testing of
foods. Unfortunately, the data available for a wide variety of food types
are not adequate for the determination of the optimum ranges for all
variables.

A. Fankc Selection 

There is considerable controversy in the literature on the value of a
sensory panel that has been selected and trained. Much of the confusion
has arisen because discrimination or difference tests have not been
distinguished from quality or consumer types of studies. In some cases a
failure to find differences between trained and untrained panels in ability
to discriminate has had its origin in methodological or statistical
deficiencies. Tarver and Ellis (1961) believe the following considerations are
important in selecting judges for flavor-difference tests: (1) precision or
inherent ability to duplicate a difference judgment; (2) reliability or
absence of bias in detecting a flavor difference; and (3) a tolerance level or
inherent sensitivity to a particular flavor difference. According to Kramer
et al. (1961), if the simulation of consumer reaction is the sole aim, a
trained panel is not needed and should be avoided. In some cases it may
be important to select individuals who are superior in their ability to
detect differences. It is difficult, if not impossible, with our present lack
of knowledge of consumer response, to select panels that will show good
agreement with consumer evaluation. The problem seems to be our
inability to define the difference and to train the panel to recognize the
difference. Furthermore, the consumer uses many criteria other than
sensory in evaluating foods.

Various procedures, based on intuition, rational judgment, or
experimentation, have been applied in selecting people whose performance in
sensory tests will be superior to that of an unselected population
(Dawson et al., 1963). These methods have been tested with varying degrees
of success. One major problem is the amount of pretesting work required
to establish reliable selection. A further difficulty may be an
experimenter's inability to specify accurately the nature of the panel member's
task. "Quickie" methods of panel selection, based upon only a few tests,
have generally not been very satisfactory. On the other hand, although
the tedious process of selecting subjects on the basis of sensitivity to the



280 6. LABORATORY STUDIES



basic tastes is often recommended, the method is of doubtful value
(Mackey and Jones, 1954; Peryam, 1958).

Since randomly selected and untrained individuals are variable in
their judgments, large panels are needed for results that are stable and
sensitive. By selecting the most stable and sensitive members and
training them, one might expect to obtain a small but efficient panel.
Selection is important since individuals differ considerably in sensitivity,
interest, motivation, and ability to judge differences. Discriminatory skill
need not be general; a good wine taster may not be a good judge of
chocolates. Girardot et al. (1952) found that candidates who did well on
some products often did poorly on others. Seldom is a judge equally
proficient in testing all qualities and all flavors of foods. The skill of a
connoisseur has been attributed to knowledge of what signs to look for
and how to interpret them rather than to increased sensitivity to stimuli
(Mobsncr, 1043). An ability or aptitude for flavor assessment could
conceivably vary in three ways: between individuals, between products, and
at different times for the same individuals and products (see Coppock
et al., 1952; Harvey, 1953). Thus it is evident that a general-purpose
panel will be less useful than a specific panel selected for the product
and method being tested. A general-purpose panel could be used for
gross screening, however, when precision must be sacrificed to save time
and expense. A sensory panel should be considered as a tool, and, as
such, it can be compared to suitable chemical methods (Lowe and
Stewart, 1947). Certain methods and tools may be used to show gross
differences, but, as the measurements needed become more refined and
precise, the methods and tools required for accurate sensory testing
become more sensitive.

Moser et al. (1950) considered that selection and training of judges
on the basis of sensitivities and consistencies are of extreme importance
in evaluating edible oils. In selecting panels, those investigators used a
double elimination test (see Chapter 6, Section II,C) based on acuity in
oil evaluation. In scoring bitterness in orange juice, Coote (1956)
illustrated the necessity of careful training and selection of panels for
estimating the degree of bitterness. For beer-tasting tests, Helm and Trolle
(1946) selected 20 out of 90 prospective judges. Those 20 had the highest
percentages of correct selections in triangle tests and were considered to
compose a far more suitable taste panel than the original group.
Kirkpatrick et al. (1957) showed the importance of panel selection for
evaluation of milk and biscuits.

Any method of selection should include a preliminary training period
designed to acquaint the tasters with the quality factors involved in the
product to be tested. This should be followed by a blind test designed to





II. Panel Selection and Testing Environment 281






show the individual's relative perception and discrimination (Harrison
and Elder, 1950).

B. Screening

Most investigators employ some type of screening process for
selecting panel members, including specific tests based on: (1) discriminating
differences between solutions or substances of known chemical
composition; (2) ability to recognize flavors or odors; (3) performance in
comparison with other panel members; and (4) ability to discriminate
differences in samples to be used later in the test. The pertinent question is
the extent to which selection devices are reflected in superior
performance in actual tests.

Kramer et al. (1961) reported that a single screening was
insufficient for selecting panel members of continued superior ability in
detecting flavor differences. After a first screening of 28 candidates, the 12
who performed best originally did not perform more efficiently than the
average of the original 28 candidates. A second screening resulted in a
more efficient group. Further screening and training would undoubtedly
have resulted in a still more efficient panel.

A general approach may be summarized, stepwise, as follows: (1) use
as test materials the same product that will be tested later; (2) prepare
tests to obtain variations in the product similar to those which will be
met with in the actual experiment; (3) adjust the difficulties of the test
so that the group as a whole will discriminate between samples but some
individuals will fail; (4) use test forms similar to those to be employed
later; (5) start with as large a group of candidates as is feasible and with
a selection test that is operationally simple if more than one stage is
required; (6) screen on the basis of relative achievement, continuing until
a top-ranking group of the size desired may be reliably selected; and (7)
at each stage reject those who are obviously inadequate, but retain more
people than will be required for the panel. This procedure is not a
routine task; it requires judgment by the experimenter, particularly as to the
criteria of achievement and as to how much data are needed for valid
selection. According to Girardot et al. (1952), the multiple-stage
selection assumes a good positive correlation between skills, but it will not
be perfect.
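
Steps (5) to (7) of the stepwise approach can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the 0.5 adequacy floor, the surplus of two candidates, and the scores are all hypothetical and do not come from the text, which leaves these criteria to the experimenter's judgment.

```python
from typing import Dict, List


def screen_stage(scores: Dict[str, float], keep: int, floor: float) -> List[str]:
    """Rank candidates by score, drop those below `floor`, keep the top `keep`."""
    adequate = {name: s for name, s in scores.items() if s >= floor}
    ranked = sorted(adequate, key=lambda name: adequate[name], reverse=True)
    return ranked[:keep]


def multi_stage_selection(stage_scores: List[Dict[str, float]],
                          panel_size: int,
                          floor: float = 0.5,   # hypothetical adequacy cut-off
                          surplus: int = 2      # hypothetical extra candidates retained
                          ) -> List[str]:
    """Screen on relative achievement stage by stage.

    Every stage but the last retains more people than the panel needs;
    the last stage cuts down to the desired panel size.
    """
    candidates = list(stage_scores[0])
    for i, scores in enumerate(stage_scores):
        is_last = i == len(stage_scores) - 1
        keep = panel_size if is_last else panel_size + surplus
        current = {name: scores[name] for name in candidates if name in scores}
        candidates = screen_stage(current, keep, floor)
    return candidates


# Hypothetical proportions of correct judgments in two screening stages.
stage1 = {"A": 0.90, "B": 0.40, "C": 0.75, "D": 0.80, "E": 0.60, "F": 0.55}
stage2 = {"A": 0.70, "C": 0.85, "D": 0.65, "E": 0.80, "F": 0.45}
panel = multi_stage_selection([stage1, stage2], panel_size=2)
print(panel)
```

In this made-up data the best first-stage candidate does not reach the final panel, which echoes the point that a single screening is a weak predictor of later performance.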

It is felt that a person with previous experience might utilize some of
the skill he has developed from a knowledge of techniques. Furthermore
he may note and detect differences which are unheeded by the
inexperienced judge. He can often describe the sensory impressions more fully
and usually has a better understanding of the particular terminology
employed.







It would, however, be impossible to test independently for all of the
characteristics or skills which may determine achievement. Christie
(1956) believes it is not necessary. Various factors underlie a unitary
skill and they may be separated analytically, but in any given sensory
test most of them will operate together. Realistic test situations may be
set up to include acts of discrimination and judgment such as will be
used later in definite experiments. Such tests will give each relevant
factor its proper weight, so relative performance will be an adequate
criterion for selecting the most useful panel members.

For selecting judges, Krum (1955) and Baker (1962) suggested that
candidates fill out a questionnaire covering the following items:
experience, availability, age, sex, health, smoking habits, quantity of particular
foods habitually consumed, food prejudices, and asthmatic,
physio-cardiac, and respiratory behavior. It is doubtful whether this information
will be of great value; conclusive evidence against the influence of some
of these factors on perception has been noted in Chapters 2 and 3.
Baker's (1962) suggestion is interesting: that individuals with a
physio-cardiac or asthmatic condition might be useful for certain panels since
they seem to have lower thresholds for air pollutants. But the psychic
attitudes of such individuals might be so unfavorable as to interfere with
[...]
Krum (1955) wrote: "It is believed that sensory ability decreases with
age and that preferences change also." Therefore, he indicated, panel
members should be between the ages of 20 and 50. The limiting factors
are lack of experience in younger people and loss of perceptual ability in
the older group. Panel members should be in good health and not
physically fatigued or worried. They should not be overly susceptible to mouth
and sinus infections or have frequent head colds. Persons should be
eliminated who are allergic to the materials to be tested. For convenience and
more accurate judging, Krum would eliminate all who do not like or
refuse to eat a particular product. According to Overman and Jerome
(1948) the members of the panel are frequently selected for their
interest or their availability rather than for the acuity of their senses of taste
and smell. In too many studies we have to "make do" with the available
subjects.

C. Sensitivity Tests

In this section we discuss the many procedures that have been
employed. In general, the screening tests use discrimination between
solutions of known chemical composition for taste, ability to recognize odors,
on-the-job performance in comparison with experienced panel members,
and ability to discriminate actual differences that will be found in the






samples to be used later in the tests. The experimental situation will
dictate which, if any, of these should be used.

For general panel selection we recommend that of the Quartermaster
group as outlined by Girardot et al. (1952). In the first stage, candidates
are eliminated primarily on the basis of lack of sensitivity to the sensory
attributes involved, and to a lesser extent because of poor memory, slow
recovery from stimulation, and failure to understand the test. In the
second stage the screening is done on the basis of ability to establish
and use stable subjective criteria. This double testing screens out those
who will do poorly because of lack of motivation, but it does not identify
in advance those who may lose interest during the course of a lengthy
experiment.

Threshold tests have been used as a basis of screening by many
workers. This procedure is seldom justified since there is little evidence that
sensitivity to the primary tastes is related to ability to detect differences
in foods. At most it is only a single factor in discriminatory ability. As
King (1937) and Hopkins (1954) demonstrated, thresholds vary greatly
between individuals and, except in extreme cases, no consistent relation
can be demonstrated between taste acuity and palatability and judges'
responses. Hall et al. (1959) determined the thresholds of candidates for
taste and flavors on two different days, and selected those sensitive to the
lowest concentrations who could duplicate their sensitivity. Hanson et al.
(1959) used ability to detect full-strength and dilute chicken broth in
selecting a panel for studying chicken flavor. A similar approach was
used by Tarver et al. (1959), who determined for each judge a bitterness
tolerance level, that is, the recognition threshold for bitterness. Repeatability
(or precision) must also be determined by standard-to-standard
comparison. Hall et al. (1959), using that procedure, found that success in
distinguishing the odd sample in triangular testing of beers showed a good
correlation with the bitterness tolerance level.

Mackey and Jones (1954) tested 22 individuals to determine
thresholds for primary tastes in water solutions and their ability to arrange a
series in the order of concentration. Also tested was their ability to
arrange, in proper order, applesauce, pumpkin, and mayonnaise containing
different levels of these same taste constituents. Both the water solutions
and foods could be so arranged, but the ability to arrange one properly
was not highly correlated with the ability to arrange the other properly.
Further, a high sensitivity did not correlate significantly with ability to
arrange foods in order of concentration of taste substances. The
variability among the judges was high. This experiment should be repeated.

Similar conclusions were reached by King (1937), who found no
correlation between excellence in judging pure solutions and ability to rate






correctly samples of bread containing various quantities of sodium
chloride, sucrose, lactic acid, and caffeine. He nevertheless suggested that the
ability to identify the basic tastes at low concentration was valuable.
Hopkins (1946) found a low but significant correlation between judges'
ratings and the actual salt content of beef. Moreover, Krum (1955) also
proposed that preliminary selection be based on sensitivity to the four
primary tastes. From the results of such tests he would eliminate those
who had low sensitivity. Knowles and Johnson (1941) classified judges
on the basis of their sensitivity to the primary tastes but found no
correlation between ability to identify the primary tastes and experience in
judging foods. See also repeatability estimates of Sawyer et al. (1962).

Various selection tests were given to prospective panel members by
Pfaffmann and Schlosberg (1952-1953), including: (1) a questionnaire
designed to reveal habits, preferences, and interest in eating and
drinking; (2) an odor recognition test consisting of 20 common odorous
substances thought to measure interest in odors; (3) a low-odor recognition
series approaching a threshold test; (4) a graded series of solutions to
determine thresholds for the four primary tastes: salt, sweet, sour, and
bitter; (5) use of the Elsberg blast-injection techniques to determine
threshold for oil of wintergreen, to detect gross departures from normal
sensitivity, as from nasal obstruction; and (6) sixteen duo-trio tests on
mayonnaise and thirty on an orange drink. The results failed to reveal
clear evidence that any item on the questionnaire predicted performance
in flavor discrimination. Selection scores on the battery of analytical tests
described did not correlate well with the performance scores. The
reliability coefficient (between test and retest) and the validity coefficient
were very low.* Most noticeable was the rather unstable performance of
the panel members for short-term work. No general clear-cut panel
ability was evident, so that prediction of a given individual's later
performance would be difficult. Those workers believe, however, that
prediction of the relative ability of panel members is possible. They reported
that, with the three panels tested, the score on a single discrimination
session indicated who would do better on later tests: those who scored in
the upper half of the total group. It is a gross measure, however, and its
use might eliminate some persons who would be good performers.

* The words reliability and validity, along with such terms as precision, accuracy,
and relevance, are often interpreted differently. A method of estimation which, on
the average, gives the true value is called an unbiased method. Unbiased estimates
are sometimes termed accurate or valid. The precision of a method refers to
repeatability and is the ability of the method to produce estimates which are very
close together (even if it is a biased method and is not actually measuring the true
value). Thus accuracy (or validity) is related to lack of bias and precision to
standard deviation.
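
The distinction drawn in this note can be made concrete with a small Python sketch. The two sets of estimates and the true value of 10.0 are invented purely for illustration; bias is taken as mean estimate minus true value, and precision as the sample standard deviation.

```python
from statistics import mean, stdev


def bias(estimates, true_value):
    """Average error of a method: zero for an unbiased (accurate, valid) method."""
    return mean(estimates) - true_value


def precision(estimates):
    """Spread of repeated estimates: a smaller standard deviation means higher precision."""
    return stdev(estimates)


true_value = 10.0                            # hypothetical true value
method_a = [9.0, 11.0, 10.5, 9.5, 10.0]      # unbiased but scattered
method_b = [12.0, 12.1, 11.9, 12.0, 12.0]    # tightly grouped but biased

for name, data in (("A", method_a), ("B", method_b)):
    print(name, round(bias(data, true_value), 3), round(precision(data), 3))
```

Method A is accurate without being precise, and method B is precise without being accurate, which is the situation the note describes in parentheses.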






Discrimination was measured by Morse (1954) in terms of the degree
to which the individual or group can distinguish between two stimuli and
communicate this distinction to investigators. Factors which affect
discriminability are: (1) the individual's taste acuity at the time of the test;
(2) the consistency or stability of this ability with time; (3) the distance
or difference between the stimuli; (4) the design of the test, especially
of its complexity and the premium it places on memory; and (5) the
method of communicating the results from the subject to the investigator.
Any conclusion on discriminability depends on the arbitrary standard set
by the investigator of the number of correct versus incorrect judgments
required. Morse required 10 correct judgments out of 12 trials for a
judge to be declared discriminative, reasoning that such a ratio of
judgments between equal stimuli could have occurred by chance in slightly
less than 5% of similar repeated trials.
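
The chance probability behind a criterion such as 10 correct out of 12 can be checked with a short binomial calculation. The sketch below assumes a two-alternative (paired) test, so that the chance of a correct judgment is 1/2; the text does not state this, nor whether a one-tailed or two-tailed probability was intended, so both are computed.

```python
from math import comb


def binomial_upper_tail(successes: int, trials: int, p_chance: float) -> float:
    """Probability of at least `successes` correct judgments by chance alone."""
    return sum(
        comb(trials, k) * p_chance**k * (1 - p_chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Assumption: a paired test, so chance success is 1/2 per trial.
one_tailed = binomial_upper_tail(10, 12, 0.5)
two_tailed = 2 * one_tailed

print(f"one-tailed: {one_tailed:.4f}")
print(f"two-tailed: {two_tailed:.4f}")
```

Under these assumptions the one-tailed probability is 79/4096, a little under 2%, and the two-tailed value is a little under 4%, so the criterion sits below the 5% level either way.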

Many workers have used paired or duo-trio (Chapter 7, Section
VI,A) tests for panel selection. Tarver et al. (1959) used a paired test
for establishing bitterness tolerance levels. Byer and Gray (1953) used
paired tests with beer samples, and applied χ² for determining the
consistency of the judges. In selecting a panel for coffee testing, Harrison
and Elder (1950) presented candidates with six cups of coffee consisting
of three sets of pairs over a period of 20 to 30 days. The candidates were
then ranked in decreasing order of their successes in making the correct
pairings, and only the top half was used. Bliss (1960) used replicate
paired tests with each subject. Stability of preferences was used as the
selection criterion. Lockhart (1951) noted that any of the binomial
systems provides a means for rapidly selecting panel members whose
sensitivities can be described in terms of probability levels. These systems can
also be used for checking the sensitivities of the panel on a day-to-day
or week-to-week basis.

The most common method of choice has been the triangle test
(Chapter 7, Section VI,B). It was first used by Bengtsson and Helm (1946)
and Helm and Trolle (1946) for selecting beer tasting panels. Beers of
known differences were used first in simple tests and later in more
difficult tests. Only the most sensitive individuals were used. Data from the
tests were used to check panel performance. The Quartermaster group
(Girardot et al., 1952) used a triangle test in the first stage of selection.
Simple tests were used first, but later the tests were of increasing
difficulty. The judges were ranked on the basis of their percentages of correct
judgments. All judges took about the same number of tests at each level
of difficulty. Only the ranking near the cut-off point is critical.
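
Ranking judges by percentage of correct triangle judgments can be paired with a check on how easily each record could arise by guessing. The following Python sketch uses invented records for four hypothetical judges and relies on the fact that a guess picks the odd sample in a triangle test with probability 1/3.

```python
from math import comb


def chance_probability(correct: int, trials: int, p_chance: float = 1 / 3) -> float:
    """Probability of at least `correct` right answers by guessing in triangle tests."""
    return sum(
        comb(trials, k) * p_chance**k * (1 - p_chance) ** (trials - k)
        for k in range(correct, trials + 1)
    )


# Hypothetical records: judge -> (correct identifications, number of triangle tests).
records = {"J1": (9, 12), "J2": (5, 12), "J3": (8, 12), "J4": (4, 12)}

# Rank on percentage of correct judgments, best first.
ranked = sorted(records, key=lambda j: records[j][0] / records[j][1], reverse=True)

for judge in ranked:
    correct, trials = records[judge]
    print(judge, f"{100 * correct / trials:.0f}%",
          f"{chance_probability(correct, trials):.4f}")
```

Because every judge here took the same number of tests, the percentages are directly comparable, which is the reason for giving all candidates about the same number of tests at each level.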

Bradley (1955) recommended repeated triangle tests for selecting
judges. Sequential methods (Chapter IV, Section III) can be
recommended because of their efficiency and because they focus attention on
the risk of accepting poor judges or of rejecting good ones. Using both
paired and triangle tests, Schlosberg et al. (1954) found that a judge's
relative performance during the first two days of testing "had a fair
predictive value for his relative over-all performance during the
following 20-day period." This was not true when preference for milk was
measured, but that result will be discussed later (Chapter 6, Section
II,F). Their experience was that ability for one panel did not carry over
to another. Hening (1948) used the triangle test to select panels for
distinguishing differences in flavor resulting from time and temperature
of storage of various products. Amerine (1948) recommended it for
selecting wine panels. Krum (1955) likewise used it, noting that each
candidate should take the same number of tests. The cut-off point was
determined by the number of panel members required and the precision
required by the problem. Moser et al. (1950) found one experienced
judge with an excellent record in testing oil but a poor record in
detecting diacetyl by triangle tests. They attributed this disparity to confusion
on the part of the subject. However, this judge may have been insensitive
to low concentrations of diacetyl, even though reputed to have a keen

Dawson et al. (1963) showed that for taste thresholds the paired
comparison resulted in lower thresholds than the triangular, and that the
single-sample procedure was the least sensitive.

Various methods of scoring have been used in selecting panels.
Hedonic scores were used by Girardot et al. (1952). Similar procedures
have been used or reviewed by Sharp et al. (1936), Trout and Sharp
(1937), Boggs and Hanson (1949), Harrison et al. (1954), and others.
Used to evaluate performance have been average deviations between
duplicate scores, the deviation from the score of a control sample
introduced in series, or the deviations of scores between first and second
tastings (with the samples coded and presented in different orders).
Although these measure individual reproducibility, they do not relate
reproducibility with one sample to ability to find differences between
unlike samples. To rectify this, the correlation coefficient between the
first score and duplicate scores for a series of samples of varying quality
may be used. Bennett et al. (1956) used the standard error of the means
to measure ability to reproduce judgments. Hopkins (1946) calculated
both correlation coefficients and regression equations to relate each
judge's assessment to the average of the panel. A range of sensitivity was
demonstrable and the suitability of individuals for tests could be
evaluated. The correlation coefficients were much higher for biscuits than for
dried milk. Moser et al. (1950) likewise calculated the correlation
coefficient and regression equation. Used for [...] was the standard
error of regression [...] individual [...] scores with [...] of the
whole panel.

Overman and Jerome [...] scores and applied two tests: a
comparison of the [...] and a comparison of
the number of times a judge duplicated his score. The first, being simple
and easy to understand, was preferred for preliminary [...]
[...]tion from the mean was the [...]. A high deviation
from an individual [...]
or marked changeability of opinion during the [...]
from his own mean [...]
lack of critical discrimination [...]
low). Since the [...] discrimination, Overman [...]
determining the [...]. The variance [...]
ability to duplicate [...]
of the consistency [...]
differences in individual [...]
for homogeneity of [...].
For his panel, also [...] (Chapter 10, Section V,A)
and demonstrated [...]
(1957) screened judges on the basis of F ratios. Those with [...]
significance at odds [...]
only those were selected [...]
of 19:1. The quick [...]
employed. In this [...]
(range within treatments [...]
(The factor depends [...]
a table.)

Sawyer (1958 [...] repeatability, the [...]
measure of the constancy [...]
a point estimate, [...] discrimination test [...]
established specific [...]
repeatability of [...] repeatability predict[...]
class correlation [...]
selection of panel [...]

Simple ranking of [...] scores often permits relative differentiation
of individual capabilities but does not ensure a specified level of
proficiency.
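
The point made earlier in this discussion of scoring, that agreement between duplicate scores does not by itself show ability to separate unlike samples, can be illustrated with a minimal Python sketch. The two judges and their scores are invented; each scores five samples of varying quality twice.

```python
from math import sqrt


def mean_abs_deviation(first, second):
    """Average absolute difference between first and duplicate scores."""
    return sum(abs(a - b) for a, b in zip(first, second)) / len(first)


def pearson_r(x, y):
    """Product-moment correlation between first and duplicate scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical scores for five samples of increasing quality, tasted twice.
discriminating = ([2, 4, 5, 7, 8], [3, 5, 4, 8, 7])   # scores follow sample quality
indifferent = ([5, 6, 5, 6, 5], [6, 5, 6, 5, 6])      # scores hover around one value

for label, (first, second) in (("discriminating", discriminating),
                               ("indifferent", indifferent)):
    print(label,
          mean_abs_deviation(first, second),
          round(pearson_r(first, second), 3))
```

Both judges differ from their own first scores by one point on average, yet only the first judge's duplicate scores track the first scores across samples, which is what the correlation coefficient is meant to reveal.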

Kramer (1955) recommended choosing judges on the basis of their
ability to detect differences at a given probability level. His procedure
involved matching concentrations, and the tables he published should be
useful whether or not duplicates are available for all samples.

Probably because of their extensive use in industry, control charts
have been used in selecting panels or maintaining level of performance.
A control chart is a statistical device used principally for the study and
control of repetitive processes. Such charts are based on the theory that
variations due to chance occur in a random pattern and that the
frequencies approach those of the binomial distribution. To see whether a
process is out of control, past data are plotted on a control chart. If the
data conform to a pattern of random variation within the control limits,
the process will be judged as being in control. Reliability is indicated by
the narrowness of spread between control limits. Since pre-established
standards can be set up, the control chart also measures the validity of
[...] results. For basic [...], see Feigenbaum (1951) and Duncan [...].
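
A control chart for the proportion of correct judgments per session can be sketched as follows. The three-standard-deviation limits are the conventional Shewhart choice and are an assumption here, as are the historical proportion of 0.70, the 20 judgments per session, and the session results; none of these figures comes from the text.

```python
from math import sqrt


def p_chart_limits(p_bar: float, n: int, z: float = 3.0):
    """Control limits for a proportion, using the binomial variance p(1 - p)/n."""
    sigma = sqrt(p_bar * (1 - p_bar) / n)
    return max(0.0, p_bar - z * sigma), min(1.0, p_bar + z * sigma)


def out_of_control(proportions, p_bar: float, n: int, z: float = 3.0):
    """Indices of sessions whose proportion correct falls outside the limits."""
    lower, upper = p_chart_limits(p_bar, n, z)
    return [i for i, p in enumerate(proportions) if p < lower or p > upper]


# Hypothetical data: proportion of correct judgments in six sessions of 20 tests.
sessions = [0.70, 0.75, 0.65, 0.70, 0.30, 0.72]
lower, upper = p_chart_limits(p_bar=0.70, n=20)
flagged = out_of_control(sessions, p_bar=0.70, n=20)

print(f"limits: {lower:.3f} to {upper:.3f}")
print("sessions outside limits:", flagged)
```

A narrower band between the limits, obtained with more judgments per session, corresponds to the greater reliability mentioned in the text.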
Control charts have been recommended by Marcuse (1945, 1947),
Moser et al. (1950), Harrison and Elder (1950), Krum (1955), Coote
(1956), and Tarver and Ellis (1961). With them, not only an individual's
performance but that of an entire panel can be held to a given precision.
Harrison et al. (1954) defined the efficiency of a panel in terms of the
probability of the panel's acceptance of definite differences in the
samples. To eliminate the number of correct selections through chance
alone, the scores were corrected with the following formula:

$$S = \frac{100(R - C)}{100 - C}$$

where S is the percent score corrected for chance expectation, R the raw 
percent score, and C the percent score expected by chance. 
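
A short Python sketch of the chance correction follows. It assumes the correction takes the common form S = 100(R - C)/(100 - C), with S, R, and C as defined above; the function name and the example percentages are illustrative only.

```python
def chance_corrected_score(raw_percent: float, chance_percent: float) -> float:
    """Percent score corrected for chance, assuming S = 100(R - C) / (100 - C)."""
    if not 0 <= chance_percent < 100:
        raise ValueError("chance_percent must be in the range [0, 100)")
    return 100.0 * (raw_percent - chance_percent) / (100.0 - chance_percent)


# Paired test: half of all selections are expected to be correct by chance.
print(chance_corrected_score(75.0, 50.0))
print(chance_corrected_score(50.0, 50.0))
print(chance_corrected_score(100.0, 50.0))
```

With this form a score at the chance level maps to zero and a perfect score stays at 100, so corrected scores from tests with different chance levels can be compared on one scale.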

More elaborate mathematical procedures may be used in certain
cases: multiple-factor analysis, item analysis, discriminant functions,
product-moment correlation coefficients (Filipello, 1957), etc.

In most cases a simple test using some binomial procedure may be
used to eliminate insensitive judges. See Amerine et al. (1959) for
detailed procedures used for wine panels. Analysis of variance or some
sequential procedure should be used for more complex situations or to
maintain the panel at some desired level of performance.

Variation among 30 judges in scoring scrambled eggs containing various 
amounts of added primary-taste compounds was described by Hopkins (1946). 

II. Panel Selection and Tasting Environment 

289 






Significant variation (p 0.01) was observed among judges. 
Some statistically significant discrimination among groups of samples 
containing different test substances and among concentrations of those 
substances was also found. Individual scores became progressively more 
erratic as quality deteriorated. Hopkins concluded that no consistent 
relation between taste acuity alone and palatability judgments should be 
anticipated. Quality evaluation includes visual, olfactory, and tactile 
sensations as well as taste sensitivity, and is further conditioned by the 
scoring methods used and the experience and frame of reference of the 
judges (see also Chapter 8). 

Sensitivity to taste or odor appears to be only one factor influencing 
discrimination. In most cases, elaborate tests based on acuity seem 
unnecessary since absolute sensitivity to the basic tastes is not closely 
related to perceptual skills. 

D. Panel Size 

The number of judges needed in a given experiment will vary according 
to the variabilities of the individuals and of the product. A preliminary 
experiment will give information from which can be calculated 
the number of judges necessary to secure a given level of statistical 
significance. As quality decreases, variability among judges increases and 
panel size must be increased to obtain differences which are statistically 
significant (Boggs and Hanson, 1949; Kefford and Christie, 1960). 
A good example of this is found in work by Hopkins (1946, 1947) with 
biscuits, dried eggs, butter, dried milk, and bacon. He noted that, at low 
levels of acceptability, discrimination was very erratic, so that more 
judges were required for significance in results. Not enough information 
is available, however, on the interrelationship of acceptability and 
discrimination. In incomplete-block studies, Hanson et al. (1951) found, 
surprisingly, that the error of the panel means was greater for samples of 
intermediate quality. 
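
As a rough sketch of how a preliminary experiment can be turned into a panel size, the Python fragment below uses the standard normal-approximation sample-size formula for detecting a mean difference. The formula choice, the 5% significance level, the 80% power, the function name, and the example figures are assumptions for illustration only; the cited authors do not state which calculation they used.

```python
from math import ceil
from statistics import NormalDist


def judges_needed(sd: float, difference: float,
                  alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate number of judges needed to detect `difference`.

    sd         -- standard deviation among judges, estimated from a
                  preliminary experiment
    difference -- smallest mean difference worth detecting (same units)
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_power = z.inv_cdf(power)
    return ceil(((z_alpha + z_power) * sd / difference) ** 2)


# Hypothetical figures: doubling the variability among judges roughly
# quadruples the panel size needed for the same difference.
print(judges_needed(sd=1.5, difference=1.0))
print(judges_needed(sd=3.0, difference=1.0))
```

The sketch reflects the point made in the text: as variability among judges rises, the panel must grow to keep a difference statistically significant.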

Of course, the panels must be much larger in preference testing than 
in difference testing. Hopkins (1947) concluded that, with bacon varying 
in degree of saltiness, panels of 35 judges were necessary to discriminate 
sensory differences of 5% with intrapanel comparisons. For interpanel 
comparisons, 62 judges would be necessary. Girardot et al. (1952) 
preferred panels of 30 to 90 in food-development studies. Bengtsson and 
Helm (1946) preferred large panels (50 to 100) in testing for differences 
which might influence future work. For routine control, 10 to 30 judges 
were believed adequate. Krum (1955) found panels of 10 to 30 sufficient. 
When only three or four individuals were available he believed it possible 
to repeat the tests enough times to get a suitable number of results. 




SENSORY ANALYSIS OF FOODS 

Edited by 

J. R. PIGGOTT 

Department of Bioscience and Biotechnology, Food Science Division, 
University of Strathclyde, Glasgow, Scotland, UK 




ELSEVIER APPLIED SCIENCE PUBLISHERS 
LONDON and NEW YORK 




ELSEVIER APPLIED SCIENCE PUBLISHERS LTD 
Ripple Road, Barking, Essex, England 

Sole Distributor in the USA and Canada 
ELSEVIER SCIENCE PUBLISHING CO., INC. 
52 Vanderbilt Avenue, New York, NY 10017, USA 

British Library Cataloguing in Publication Data 

Sensory analysis of foods. 
1. Food - Sensory evaluation 
I. Piggott, J. R. 
664'.07 TX546 

ISBN 0-85334-272-5 

WITH Oft TABLES AND 60 ILLUSTRATIONS 

© ELSEVIER APPLIED SCIENCE PUBLISHERS LTD 1984 



The selection and presentation of material and the opinions expressed in this publication are 
the sole responsibility of the authors concerned. 

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, 
or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, 
or otherwise, without the prior written permission of the publishers, Elsevier Applied Science 
Publishers Ltd, Ripple Road, Barking, Essex, England 

Printed in Great Britain by Galliard (Printers) Ltd, Great Yarmouth 



PREFACE 

In 1965 a book which has since occupie 
science of sensory analysis was publishe 
R. M. and Roessler, E. B., Principles 
(Academic Press, New York). The author 
also hope that it will stimulate further rese 
can be seen from the rapid growth in the 
evaluation of foods and beverages. Since 
senses has grown; new sensory methods 
been improved, both in application 
powerful computers are widely available. 

Reviews of these developments are not 
in compiling this book was to provide a 
knowledge and practice in sensory analys 
a laboratory manual, but to provide the 
review of progress throughout the wor 
foundation on which to build his own 

Individual chapters have been contrib 
their fields, who have surveyed and interpr 
chapters are concerned with examinations 
provide a basic understanding of the so 
sensory testing and by which food flavor 
These are followed by descriptions of speci 
appearance assessment, and by reviews 
ranking and scaling methods, and descrip 
laboratory, a chapter is devoted to con 
concluded by descriptions of the sta 
descriptive and inferential analysis of sens 



250 



H. L. MEISELMAN 

TABLE 4 

RULES FOR SCALE DESIGN 

rAn^alc *rid u7ro.«7^ wordTIg: ^Wdislikc. no not 

b KSry'S.t PoinSu»a ,o modified with modifiers of the root such as vcry^ ; 

o. S 5.1 ASRSS^ - in < hc of 100 many sca,c 1 

points. 



develop his/her own scales only rarely! It is preferable and safer to use 
scales which you have used previously with demonstrated success (defined 
statistically) or which have been used and demonstrated by others. The 
most well known scale in food research is the nine-point hedonic scale 
(Table 5) developed by the US Army in the 1940s. It is interesting to note 
that this scale satisfies the five points mentioned above: it is adequately long 
(nine points), it possesses a neutral point, it uses one root word 
(like/dislike), it uses the same modifiers above and below neutral 

of frequency (Table 6) 

and amount (Table 7), both key concepts in food attitude research. These 
data provide guidelines for selecting four- to nine-point category scales for 
these concepts. The four-point scale of frequency (never, sometimes, often 

TABLE 5 

THE NINE-POINT HEDONIC SCALE USED FOR FOOD 
ACCEPTANCE AND FOOD PREFERENCE 

9 Like extremely 
8 Like very much 
7 Like moderately 
6 Like slightly 
5 Neither like nor dislike 
4 Dislike slightly 
3 Dislike moderately 
2 Dislike very much 
1 Dislike extremely 
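
To show how ratings on this scale yield a hedonic score with a confidence interval, a minimal Python sketch follows. The nine category labels come from Table 5; the panel ratings are invented, and the large-sample normal approximation for the interval is an assumption (a t quantile would suit a small panel better).

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Category labels as listed in Table 5.
HEDONIC_SCALE = {
    9: "Like extremely", 8: "Like very much", 7: "Like moderately",
    6: "Like slightly", 5: "Neither like nor dislike", 4: "Dislike slightly",
    3: "Dislike moderately", 2: "Dislike very much", 1: "Dislike extremely",
}


def mean_hedonic_score(ratings, confidence=0.95):
    """Return (mean, lower, upper) for a list of 1-9 hedonic ratings."""
    for rating in ratings:
        if rating not in HEDONIC_SCALE:
            raise ValueError(f"not a point on the nine-point scale: {rating}")
    centre = mean(ratings)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * stdev(ratings) / sqrt(len(ratings))
    return centre, centre - half_width, centre + half_width


# Hypothetical ratings from ten panelists for one product.
ratings = [7, 8, 6, 7, 9, 5, 7, 8, 6, 7]
score, low, high = mean_hedonic_score(ratings)
print(f"mean {score:.2f}, 95% interval {low:.2f} to {high:.2f}")
```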






254 



H. L. MEISELMAN 



with questions of obesity and diet, with questions of 

oeh.vioilr which wiKS* aeCi ' • inV< "« ' 

would like 10 think il i». I„ ,„,,„„ c ° J ' !T* 01 ^ " ! B»od as w 
2.3. Food Preference 

food acceptance teSg 3 thtrcsc, " USc ? in '^^ory >; 

preference testing (usin :5<] Oo t./TSr, ^ °? mparcd 11 in ■ ^od -t 

(eliminating (he m 1 < f* " 8lS) l ° " """I*** I 

five-point ^(aKfaSTS v , ' S *° CX J lrCme,y ' ™"&°™) «nd « 3 

three surv y cnS w 2 „ ^ My ' <* tc S<>™>- The £ 

•»»l between •^n.S^^r mlKh ^ | 







questions of how much money is 
repares wholesome meals for the 
such things are asked, as well as 
might lose or gain something as a 

ople how often they eat out of 
old beverages, etc., might involve 

depending on the memory of the 
emory will not be as good as we 
s it has been demonstrated that 
ple information. Beware. 

reference has been the one most 
y the Quartermaster Food and 
in the late 1940s (see Peryam and 

d to indicate that a food was 'never 
volved more research than other 

was selected rather than a paired- 
tems are used rather than lists of 

the rating scale approach and the 
relative preferences in good 

re the number of scale points and 
already been used in laboratory 
archers compared it in a food 
0 item lists) to a seven-point scale 
slike extremely' categories) and a 
ely' and 'slightly' categories). The 
e in test-retest reliability and the 
test-retest reliability (0.96), so the 
ngth were both adopted. 
e next step in the development of 
osen by scaling their meaning, so 
'very much' should be the same as 

ating scales is positioning on the 
owed a list of foods presented on 



CONSUMER STUDIES OF FOOD HABITS 



255 



the left-hand side of the paper. The question raised was 'Should the scale 
begin with the 'dislike' or 'like' end of the rating scale?'. They found, 
on a list of 45 food items, that the proportion of answers in each of the nine 
categories was almost identical (correlation coefficient = 0.96). However, 
there were significant differences between the form of the scale in which 
'dislike extremely' was placed on the extreme left and the one in which 'like 
extremely' was placed in that position. Beginning the scale with 'dislike 
extremely' led to a significantly greater frequency of the 'dislike' categories. 
Beginning the scale with 'like extremely' did not produce the analogous 
increased frequency for 'like' categories. In practical terms the effects are 
very small. The correlation between the 45 pairs of food means was 0.997, 
and it is the mean which is used for predictive purposes with these data. The 
researchers suggested that the scale should begin with 'like extremely' but 
hastened to add that no clear problem resulted from the reverse. 

The issue of preference frequency has been another focus of food 
preference scaling and has been phrased in a variety of ways: 'How often 
would you like to eat the menu items?', 'How often would you be willing to 
eat the item?', 'How often would you like to see the food offered?', 
'How often would you like to eat the item?'. 

Preference frequency scales have been of two types, one using verbal 
categories of frequency and the other using quantitative categories (Table 
8). Almost all frequency scales used have had four or nine categories. The 
verbal-based scales have depended heavily on the existing temporal system 
of day, week and month. Two scales have used the term 'often' in addition 
to day, week or month referents (Leverton, 1944; Schuck, 1961); 
this could represent difficulties in trying to translate into actual temporal 
units. Benson (1958) also used a four-category scale but stuck to temporal 
terms (once a day, week, month, year). Hartmuller (1971) and Knickrehm 
et al. (1967) used identical nine-category scales from 'twice a day' to 'once a 
month' (plus 'never want'). The QMFCI research on frequency scales also 
used a nine-category scale which overlapped greatly with the one just cited. 
In some administrations it was extended to 'every three months' and to 
'once a year'. The question which arises then is: 'What is the most 
appropriate time frame for the preference frequency scale?' This question 
has not been directly addressed, most scales using the month as the unit. 
For most purposes it would appear that items consumed only once per year 
would be insignificant, unless very specialised food services (class A 
restaurants, catering) were of interest. 

The QMFCI scale also listed the frequency per month of all verbal scale 
categories. The category 'every three months' was rated 0.3 and the 






256 H. L. MEISELMAN 

TABLE 8 

SCALES OF PREFERENCE FREQUENCY 

Sources (columns): Knickrehm et al. (1967); Hartmuller (1971); QMFCI; 
Benson (1958); Schuck (1961); Leverton (1944) 

Frequency categories (rows): Often; Twice a day; Once a day; Every other 
day; Several times per week; Twice a week; Once a week; Every other week; 
Once a month; Every 3 months; Once a year; Never/unwilling to eat; Not 
familiar 




category 'once a year' was rated 0.1. This reinforces the use of the month as 
the unit. It also provides both the test respondent and the researcher with a 
quantified scale for analysis and prediction. In some cases (e.g. Knickrehm 
et al., 1967) subjects responded on the frequency scale by listing the number 
of the verbal category. For example, twice a week was coded as 4. The 
potential problem here is that the test respondent is not using the actual 
frequency statement in his answer, whereas in other scales he is. 
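
The difference between coding a response by its category number and coding it by its frequency per month can be sketched in Python as follows. Only the values 0.3 ('every three months') and 0.1 ('once a year') are quoted in the text; every other per-month value, the category order, and all names in the code are illustrative assumptions.

```python
# Times-per-month values for verbal frequency categories.
# Only 'every three months' (0.3) and 'once a year' (0.1) are quoted in the
# text; the remaining values are assumptions (30-day, four-week month).
TIMES_PER_MONTH = {
    "twice a day": 60.0,
    "once a day": 30.0,
    "every other day": 15.0,
    "twice a week": 8.0,
    "once a week": 4.0,
    "every other week": 2.0,
    "once a month": 1.0,
    "every three months": 0.3,
    "once a year": 0.1,
}

# Category-number coding: 1 for the first category, 2 for the second, ...
# (hypothetical ordering; with it, 'twice a week' is coded 4).
CATEGORY_CODE = {label: i for i, label in enumerate(TIMES_PER_MONTH, start=1)}


def monthly_frequencies(responses):
    """Translate verbal responses into times per month."""
    return [TIMES_PER_MONTH[r] for r in responses]


responses = ["once a week", "every three months", "twice a week"]
print([CATEGORY_CODE[r] for r in responses])  # ordinal codes, not frequencies
print(monthly_frequencies(responses))         # quantified scale
```

The category codes only rank the answers, whereas the per-month values can be averaged and used for prediction.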

A preference scale (Fig. 1) developed more recently for the military used 
a quantitative preference frequency scale based on the week and month 
(Meiselman et al., 1972). The subject was asked how often he would like an 
item in terms of days per week (answer 1, 2, 3, 4, 5, 6 or 7) and weeks per 
month (answer 1, 2, 3 or 4). While this does directly ask the preference 
frequency question in quantitative terms, it forces the subject into a 
week-month system. If he wants squash 13 times per month, he cannot so 
indicate. Further it assumes that the weekly pattern is repeated. This is also 
the case in some verbal categories scales. A more recently developed survey 
(Fig. 2) (Meiselman and Waterman, 1978) avoids weekly units and asks for 
preference frequency per month using a scale which permits coding of any 
number from 0 to 31 (actually 39 is possible) days per month. Note again 
that the monthly unit was the unit of choice. 
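
The limitation of the week-month system can be checked with a few lines of Python. The sketch assumes that a response is interpreted as days per week multiplied by weeks per month, which is a reading of the passage rather than a documented scoring rule; the variable names are illustrative.

```python
# Monthly frequencies expressible as (days per week) x (weeks per month),
# with 1-7 days per week and 1-4 weeks per month.
expressible = sorted(
    {days * weeks for days in range(1, 8) for weeks in range(1, 5)}
)

print(expressible)
print(13 in expressible)  # False: 13 times per month cannot be indicated

# Frequencies up to 28 per month that the format cannot express.
print([n for n in range(1, 29) if n not in expressible])
```

A direct 0 to 31 monthly scale avoids these gaps because any whole number of days can be coded.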






CONSUMER STUDIES OF FOOD HABITS 



257 






The numerical and verbal scales possibly reduce to the same thing when 
the subject is using numbers in the numerical scale and using the verbal 
categories in the non-numerical scale. When the subject uses numerical 
codes for the verbal scale categories, problems can arise. The focus of his 
attention is then on a number which does not directly represent frequency. 
He then begins to use the category scale of numbers without necessarily 
referring them to their referent frequencies. This is similar to what can 
happen in the hedonic acceptance scale in which one begins to use a number 
without realising its referent (extremely good, very bad, etc.). 

One potential advantage of certain quantitative scales of frequency is 
that they can be ratio scales, that is, scales with equal intervals and a zero 
point. Ratio scales permit statements of ratios so that one could say x is 
preferred twice as often as y, etc. The frequency scale developed by US 
Army Natick Laboratories (Meiselman and Waterman, 1978) is such a 
scale (from 0 to 39). Both the old QMFCI scale and the scale used by 
Meiselman et al. (1972) are not continuous series of numbers; hence the 
subject is selecting categories rather than dealing in ratios. 

The scales discussed so far have been either hedonic scales or preference 
frequency scales. Schutz (1965) developed a food action rating scale (FACT 
Scale), by scaling 18 action statements representing affective attitudes 
towards foods. Nine were selected to give equal intervals. The standard 
deviation and mean of the FACT scale and the nine-point hedonic scale are 
very similar; the two scales correlate 0.97 for food means. The overall 
tendency for the FACT means to be lower than the hedonic means 
apparently results from slightly lower FACT ratings for desserts and 
semisolid and liquid foods. 

Van Riter (1956) used a scale based on home use of foods (specifically 
vegetables) including scale categories: 'never served at home', 'one or more 
of my family dislike the food', and 'prepared differently at home'. These 
categories are indicators of factors that are possibly important in food 
preference determination. Whether they are good measures of the 
preferences themselves is unclear without a more complete evaluation. 

2.4. Examples of Food Preference Data 

Although a large amount of food preference data is collected by various 
institutions and commercial organisations, little of it reaches the open 
literature. However, there is a growing body of data for the investigator to 
tap so that many food preference decisions need not be made intuitively. 
One of the largest of available data bases is that of the United States Armed 
Forces which have been collecting food preference data for almost 40 years. 



366 



H. J. H. MACFIE AND D. M. H. THOMSON 



role in the fitting algorithm. Following Carroll (1972), it is necessary to 
distinguish two modes of analysis. In internal analysis the objective is to 
achieve a consensus configuration of the stimuli based solely on the 
preference data. In external analysis the aim is to relate preferences to 
physicochemical measurements using as parsimonious a model as possible 
to take account of individual differences in scoring patterns. 

3.3.2. Internal Analyses 

The simplest approach to modelling individual differences in preference is 
the vector model proposed by Tucker (1960). The set of stimulus points are 
embedded in a multidimensional space and each subject is represented by a 
vector in the space. The ordering of the projections of the stimulus points 
on to the vector gives the preference ranking of that subject. The cosine of 
the angle that a vector makes with the dimensions of the space is considered 
to be proportional to the relative importance of that dimension in the 
preference judgement. 
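
The projection step of the vector model can be sketched in Python. This is not the MDPREF program, which also estimates the configuration from the data; the two-dimensional stimulus coordinates, the stimulus labels P, Q and R, and the function names are hypothetical.

```python
from math import cos, radians, sin


def direction_cosines(angle_deg):
    """Cosines of the subject vector with dimension 1 and dimension 2."""
    return cos(radians(angle_deg)), sin(radians(angle_deg))


def preference_ranking(stimuli, angle_deg):
    """Rank stimuli by their projection on to a subject's unit vector.

    stimuli   -- mapping of stimulus name to (x, y) coordinates
    angle_deg -- direction of the subject vector, in degrees
    """
    ux, uy = direction_cosines(angle_deg)
    projections = {name: x * ux + y * uy for name, (x, y) in stimuli.items()}
    return sorted(projections, key=projections.get, reverse=True)


# Hypothetical configuration of three stimuli in two dimensions.
stimuli = {"P": (1.0, 0.2), "Q": (0.3, 0.9), "R": (-0.8, -0.1)}

print(preference_ranking(stimuli, 0))   # ['P', 'Q', 'R']
print(preference_ranking(stimuli, 90))  # ['Q', 'P', 'R']
```

A subject whose vector lies along the first dimension ranks the stimuli by their first coordinate, and a subject at 90 degrees ranks them by the second, which is how the direction cosines express the relative importance of each dimension.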

An example from our own experience demonstrates the use of the vector 
model very effectively. The data (unpublished) was generated at the Torry 
Research Station, Aberdeen and we are grateful to P. Howgate for 
permission to use it. In this study, 48 subjects were asked to rate six types of 
fish or fish product on an hedonic (Peryam/Pilgrim) scale: 1 = dislike 
extremely, 9 = like extremely. For brevity Table 4 shows the session means 
for only six subjects, A-F. The complete data was input to the MDPREF 
program (Chang and Carroll, 1968), and the two-dimensional solution, 
which accounts for 85.3% of the variation, appears in Fig. 9. The subjects 
appear as points on the unit circle and a preference ranking is obtained by 

TABLE 4 

MEAN PREFERENCE SCORE FOR SIX SUBJECTS ON SIX FISH PRODUCTS 

                                   Stimulus 
Subject  White fish in  White fish  Scad    Scad    Cod mince  Blue whiting 
         parsley sauce  fitters     (good)  (poor)  fingers    fingers 
A        7.7            7.3         7.4     6.0     7.3        8.0 
B        5.0            5.2         6.2     3.7     5.3        5.0 
C        7.5            6.6         5.0     4.0     6.3        4.5 
D        7.6            6.8         5.6     4.7     6.0        5.7 
E        6.5            6.3         6.1     6.5     5.3        3.9 
F        5.7            (vH         6.0     6.6     7.2        4.7 



Fig. 9. MDPREF solution displaying 
fish in parsley sauce; WF, white fish 
mince fingers; BW, blue whiting 

the unit circle 

drawing a line passing through the ori 
Perpendiculars from each stimulus point 
The horizontal dimension, around which 
the conventional preference dimension. The 

IcL f, , v th ,he rcformcd P«*«« » f Pi 
*«pubiltty score. However, there arc suF 




To Mely, Kami* Lisa, Illci 
Vccita, Kajcsh, Sccmn, Sap 



COPYRIGHT © 1984, BY ACADEMIC PRESS, INC. 
ALL RIGHTS RESERVED. 

NO PART OF THIS PUBLICATION MAY BE REPRODUCED OR 
TRANSMITTED IN ANY FORM OR BY ANY MEANS, ELECTRONIC 
OR MECHANICAL, INCLUDING PHOTOCOPY, RECORDING, OR ANY 
INFORMATION STORAGE AND RETRIEVAL SYSTEM, WITHOUT 
PERMISSION IN WRITING FROM THE PUBLISHER. 

ACADEMIC PRESS, INC. 
Orlando, Florida 32887 

United Kingdom Edition published by 
ACADEMIC PRESS, INC. (LONDON) LTD. 
24/28 Oval Road, London NW1 7DX 

Library of Congress Cataloging in Publication Data 

Gacula, Maximo C. 

Statistical methods in food and consumer research. 

(Food science and technology) 
Includes index. 

1. Food research—Statistical methods. 2. Consumers— 
Research—Statistical methods. 3. Experimental design. 
I. Singh, Jagbir. II. Title. 
TX367.G33 1985 664'.0072 84-11170 

ISBN 0-12-272050-4 (alk. paper) 

PRINTED IN THE UNITED STATES OF AMERICA 
84 85 86 87 9 8 7 6 5 4 3 2 1 



A BRIEF REVIEW OF TOOLS FOR STATISTICAL INFERENCE 
0.14) — H " 0 - ' " • * * 

standard normal r.v. 75. thrce ot h er probability dis- 

Stan m statistical t-ta?^^ .^StSS^TS Jw^f—O?) «B«ribi.iio«w 
uibuUom.Thcyj arecalUri h < *-U«J»* dilutions depend on only one 
and the F dfatnbuUon ^J^and x ^ ^ ^ 

parameter, whereas the fJ^Xutions are called the degrees of freedom 
nology, the parameters of t^JJgJS J«o parameters are identified as 
(DF) parameters. For the /' f ^ "V°" t ^« np The percentiles of the t 

?e "numerator" and ^^^t^n J^ 

distribution are gtven « rable ML As the u dislribuUoa Thal IS for 

l Sa?diUtion — 
K denominator OF are given In ^Tablc A-4. ^ 

that 

l// ; «.,.,.«, = F, ..,.„,.„,. 
Thetcareothcrprobabmtyd^bu^ 

E "Led, the population p = tc- ^ ^ 
to be estimated by ap *^J2^ q £«d to estimate population 
example, statistic * the samp* nean, may be ^ 
mean * Also S 2 , the sample vanamA can J u cr> „ ^d an 

variance * ». A static, when t is used I o «um« P ^ Ung 

oMcr. Since an estimator « ^^^^^ an estimator 



1 INTRODUCTION 

the extent of sampling vanation ^JJ^^ 1 } lh cn 0 is -Hod 
estimating is such that the ^.^^SSn of the distribution of $ 
an MbM estimator. fttf). I" this notation, an 

is also called the expend va «c of 0 and ^Jgjj^^ mcan * and varla „cc 

estimator 0 offf is unbiased if h{Q) l\™™tion J*n » and variance c\ 
* -e unbind estimato-o he pom at.on mc*^ ^ 

respectively. That is. EX) " ^'^^Jtheoopolalion standard deviation 
deviation S is not an unbased g^J^*^ 5n addili on to the 
a. In our notation, /:(&) * B » l tn " . whjch mav bc usc d in deciding 
unbiascdncss, such as consistency ^^^J^^ details because 
•PPi^^'^^^JS «sc only the statistically estab- 

An estimator when used to estimate a ^ estimfll «, 

an interval, for example, X - ^ *J • H ,or ' f an imcrval cSlim atc 

the sample standard devia * on '^^; But we can ask; How 

may or may not contain the ^ J"™ft wntai J lh c true value of 

,„K/.r* v -= S7 /n is an estimate of the stand- 
ar<1 deviate or ihc ■Jnd«^ «orrfJT . W £~J£ P ^ 

probability distribution of Y is the fono *'«* - i Suppose that wc can 
P Consider a population with mean , compute X 

list all possible random »™^XE^^S^ 1 - , 
foreachsamplcthusgencrati. g h^'buuonw of ^ for M 




Testing of Hypotheses 

Another area of statistical infere 
theses. The testing of hypotheses con 

1. Formulation of hypotheses 

2. Collection, analysis of data, 

3. Specification of a decision ru 
or rejecting hypotheses 

The formulation of the hypo 
proposed experiment. For instance 
whether a process modification b 
researcher proceeds by producing 
modified process. Let μ denote the 
while μ0, the mean texture of the 
hypothesis is written as 

H0: 

which states that on the average t 
modified process. The alternative h 

Ha 

which states that there is a change 
process. The alternative Ha in (1. 
may be less than μ0 (μ < μ0) 
alternative hypothesis is either 

The formulation of a one-sided h 

of the experimenter. 
If, instead of in the mean, 

similarly formulate null and alter 

alternative hypotheses have been 
used to develop statistical decisi 
Since a statistical decision 
account for sampling variations 
of the null hypothesis on the ba 
rejection of the null hypothesis 
the probability of which is de 
α = 0.05 indicates that on the 
hypothesis 5 times in 100 cases, 
the statistical test; values of 0. 



This Page is Inserted by IFW Indexing and Scanning 
Operations and is not part of the Official Record 

BEST AVAILABLE IMAGES 

Defective images within this document are accurate representations of the original 
documents submitted by the applicant. 

Defects in the images include but are not limited to the items checked: 

□ BLACK BORDERS 

□ IMAGE CUT OFF AT TOP, BOTTOM OR SIDES 

□ FADED TEXT OR DRAWING 

□ BLURRED OR ILLEGIBLE TEXT OR DRAWING 

□ SKEWED/SLANTED IMAGES 

□ COLOR OR BLACK AND WHITE PHOTOGRAPHS 

□ GRAY SCALE DOCUMENTS 

□ LINES OR MARKS ON ORIGINAL DOCUMENT 

□ REFERENCE(S) OR EXHIBIT(S) SUBMITTED ARE POOR QUALITY 

□ OTHER: 

IMAGES ARE BEST AVAILABLE COPY. 
As rescanning these documents will not correct the image 
problems checked, please do not report these problems to 
the IFW Image Problem Mailbox. 



